It is sometimes argued that Bayesian inference is unaffected by data-dependent
stopping rules. Although this is formally true for ignorable rules, there is likely
to be heightened sensitivity of inference to prior assumptions when using
data-dependent rules rather than stopping rules that do not depend on the data.
This point is illustrated in a simple example.
SENSITIVITY OF BAYES INFERENCE WITH IGNORABLE BUT DATA-DEPENDENT STOPPING RULES
Paul R. Rosenbaum* and Donald B. Rubin**

1. Complex Stopping Rules in Applied Inference: Discussion in the Context of an Example.

Complex stopping rules, that is, complex rules for determining how much data to
collect, are often difficult to avoid in practice. One example occurs in the initial
phase of experimentation to determine a high yet tolerable dose of a new cancer
chemotherapy. The usual procedure is to offer a low dose of the new drug to several
patients; if this dose is well tolerated, the next few patients receive a somewhat higher
dose. If toxicity is observed, the dose is not increased, but rather several additional
patients are given the current dose; based on these additional results, it is decided
whether to terminate the experiment or continue increasing the dose. Since the safety of
experimental subjects is a primary concern, and since it is not always possible to
anticipate the nature of all types of toxicity that might arise, rigidly defined stopping
rules are often impractical. Further increasing the difficulties of statistical modeling
is the fact that resultant sample sizes are often quite small (i.e., less than 20). Often,
inferences are based on statistical procedures that ignore the stopping rule (e.g., Brown
and Hu, 1981).

Although we will discuss the effects of stopping rules on Bayesian inferences within
the context of phase I trials, our model for these trials is simplified, and in some ways
artificial. Consequently, our results are not directly applicable to the practice of these
trials, but are mainly suggestive of directions for further work.

Write $D_i$ for the dose given to patient $i$ and write $T_i$ for the vector measure of
toxicity observed for patient $i$, where $N$ patients (indexed by $i = 1, \ldots, N$) are
observed before the study is terminated. Since dose $D_i$ depends on previous doses,
$D^*_{i-1} = (D_1, D_2, \ldots, D_{i-1})$, and previous toxicities,
$T^*_{i-1} = (T_1, T_2, \ldots, T_{i-1})$, and possibly an unknown parameter $\psi$ of the
dose escalation rule, we write for the probability of $D_i$ given these quantities

(1)  $\mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1}, \psi)$.

Similarly, since the index, $N$, of the last patient entered into the trial depends on the
doses and toxicities through the $N$th trial, we write for the data-dependent stopping rule

(2)  $\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma)$,

where $\gamma$ is a possibly unknown parameter of the stopping rule. Because it is usually
reasonable to assume that dose $D_i$ determines toxicity $T_i$, we complete the
specification of the joint distribution of $N$, $D^*_N$, $T^*_N$ by writing

(3)  $\mathrm{pr}(T_i \mid D_i, D^*_{i-1}, T^*_{i-1}, \theta) = \mathrm{pr}(T_i \mid D_i, \theta)$,

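where $\theta$ is an unknown parameter that relates toxicity to dose. The parameter
$\theta$ is the unknown parameter of primary interest. Thus, the distribution of the
observed data $(N, D^*_N, T^*_N)$ given the unknown parameters $(\theta, \gamma, \psi)$ is
given by

(4)  $\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma) \Bigl\{\prod_{i=1}^{N} \mathrm{pr}(T_i \mid D_i, \theta)\Bigr\} \Bigl\{\prod_{i=1}^{N} \mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1}, \psi)\Bigr\}$.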

The frequency properties of interval estimates of $\theta$ generally depend on the factor

$\Bigl\{\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma) \prod_{i=1}^{N} \mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1}, \psi)\Bigr\}$

of the distribution (4), which may be complex, involving high dimensional unknown
parameters $\gamma$ and $\psi$. Therefore, small sample frequentist inferences may be
difficult if not impossible to obtain, unless carefully constructed and followed stopping
rules are used (e.g., Armitage 1975; Pocock 1977; DeMets and Ware 1980; Fleming 1982).
There exists the hope, often stated (e.g., Lindley, 1972, p. 24), that Bayesian interval
estimates produced by ignoring the stopping rule will not lead the investigator astray.

If $\theta$ and $(\psi, \gamma)$ are a priori independent, then the dose escalation and
stopping rules are ignorable (Rubin, 1976, 1978a,b) in the sense that the marginal
posterior distribution of $\theta$ can be obtained by simply ignoring the factors
$\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma)$ and
$\mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1}, \psi)$. (To see this, note that the posterior
distribution factors into a term involving $\theta$ and one involving $(\psi, \gamma)$;
see also Rubin (1978).) If $\theta$ and $(\psi, \gamma)$ are a priori dependent, then in
general Bayesian inferences about $\theta$ explicitly depend on the stopping rule or the
dose escalation rule.
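
The factorization behind the ignorability claim can be written out explicitly. The following is only a sketch in the notation of (1)-(4), writing $\psi$ for the escalation-rule parameter and $\gamma$ for the stopping-rule parameter, and assuming the prior factors as $p(\theta)\,p(\psi,\gamma)$; it restates the argument in the text rather than adding a new result.

```latex
\begin{align*}
p(\theta,\psi,\gamma \mid N, D^*_N, T^*_N)
  &\propto p(\theta)\,p(\psi,\gamma)\,
     \mathrm{pr}(N \mid D^*_N, T^*_N, \gamma)
     \prod_{i=1}^{N}\mathrm{pr}(T_i \mid D_i,\theta)
     \prod_{i=1}^{N}\mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1},\psi) \\
  &= \Bigl[\,p(\theta)\prod_{i=1}^{N}\mathrm{pr}(T_i \mid D_i,\theta)\Bigr]
     \Bigl[\,p(\psi,\gamma)\,\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma)
       \prod_{i=1}^{N}\mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1},\psi)\Bigr].
\end{align*}
```

Integrating the second bracket over $(\psi,\gamma)$ gives a constant in $\theta$, so the marginal posterior of $\theta$ is proportional to the first bracket alone.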


The dependence of Bayesian inference on nonignorable experimental designs is well
known (Rubin 1978a; Rosenbaum and Rubin 1983). In this paper we study the possible
sensitivity of Bayesian inference even to ignorable stopping rules, and for this purpose,
we assume that $\theta$ and $(\psi, \gamma)$ are independent. We shall see that Bayesian
inference can depend on the stopping rule in subtle ways even when it is ignorable. In
particular we find that Bayesian inferences from experiments involving a priori fixed
stopping rules will frequently be less sensitive to model/prior specifications (i.e., more
robust in the sense of Box and Tiao (1962, 1964, 1973) and Dempster (1977)) than inferences
from experiments with ignorable but data-dependent stopping rules.

Thus we see that, as in other contexts (e.g., randomization in experiments (Rubin
1978a; Rosenbaum and Rubin 1983), and nonresponse in sample surveys (Rubin 1978b; Little
1982)), an inability to draw frequency inferences, or at least an inability to do so
straightforwardly, is indicative of heightened sensitivity of Bayesian inference to prior
specifications.

2. A Simple Illustrative Case

In order to focus on the effect of the stopping rule on Bayesian inference we consider
a particularly simple case. We begin by assuming that the dose of the drug, $D_i$, remains
constant at $D$, so that for the escalation rule we have

(5)  $\mathrm{pr}(D_i \mid D^*_{i-1}, T^*_{i-1}, \psi) = 1$ if $D_i = D$, and $0$ otherwise.


Because the dose is constant for all patients, the average toxicity at the fixed dose $D$
is a natural parameter to estimate. Suppose

(6)  $\mathrm{pr}(T_i \mid D_i = D, \theta) = \frac{1}{\sqrt{2\pi}} \exp\{-(T_i - \theta)^2/2\} = \phi(T_i - \theta)$,

so the toxicities are normal with mean $\theta$ and variance 1, where $\phi(\cdot)$ is the
standard normal probability density.

We complete the model specification by assuming a simple stopping rule: 100 trials
will be conducted unless the data suggest that the average toxicity, $\theta$, is too high,
in which case the study will be terminated. In particular, data will be collected to the
100th trial unless the current t-statistic testing $\theta = 0$ is larger than $C$, where
$C$ is a constant; that is, unless $\bar{T}_N > C/\sqrt{N}$, where
$\bar{T}_N = \sum_{i=1}^{N} T_i / N$.

Formally, we have

(7)  $\mathrm{pr}(N \mid D^*_N, T^*_N, \gamma) =$
       1  if $N = 100$ and $\bar{T}_i \le C/\sqrt{i}$ for all $i < 100$,
       1  if $N < 100$, $\bar{T}_N > C/\sqrt{N}$, and $\bar{T}_i \le C/\sqrt{i}$ for all $i < N$,
       0  otherwise.
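
As an illustration of rule (7), the following minimal Python sketch finds the stopping index for a given sequence of toxicity measurements. The function name `stopping_index` and its arguments are hypothetical, introduced only for this example; the rule itself is the one in (7), with `max_n` playing the role of the 100-trial cap.

```python
import math


def stopping_index(observations, c, max_n=100):
    """Return the terminal sample size N under rule (7).

    observations: list of toxicity values T_1, T_2, ... (hypothetical input).
    c: the constant C; stopping occurs at the first n < max_n for which the
       running mean exceeds c / sqrt(n), and at max_n otherwise.
    """
    total = 0.0
    for n, t in enumerate(observations[:max_n], start=1):
        total += t
        # Stop at the cap, or as soon as the running mean is elevated.
        if n == max_n or total / n > c / math.sqrt(n):
            return n
    raise ValueError("sequence ended before the rule stopped the trial")
```

With `c = math.inf` the elevation test can never succeed, so the function returns `max_n`, which corresponds to a sample size fixed a priori.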


The escalation and stopping rules defined by (5) and (7) are ignorable, and moreover,
are free of unknown parameters. Consequently, Bayesian inferences for $\theta$ with a fixed
prior, $p(\theta)$, follow from the prior distribution for $\theta$ and the normal
specification (6); that is, the posterior distribution of $\theta$ is proportional to

(8)  $p(\theta) \prod_{i=1}^{N} \phi(T_i - \theta)$.

Suppose

(9)  $p(\theta) = \sqrt{\rho}\, \phi\{(\theta - \mu)\sqrt{\rho}\}$,

so that a priori $\theta$ is normal with mean $\mu$ and variance $1/\rho$. Then the
posterior distribution of $\theta$ is normal with mean

(10)  $(\rho\mu + N\bar{T}_N)/(\rho + N)$

and variance

(11)  $(\rho + N)^{-1}$.
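
The conjugate update in (10) and (11) is short enough to state as code. The sketch below is only illustrative: the function name `normal_posterior` and the argument names are assumptions made for this example, with `mu` and `rho` standing for the prior mean and prior precision of (9).

```python
def normal_posterior(t_bar, n, mu, rho):
    """Posterior mean and variance of theta per (10) and (11).

    t_bar: sample mean of the n toxicities (unit variance assumed, as in (6)).
    mu, rho: prior mean and prior precision (prior variance is 1 / rho).
    """
    post_mean = (rho * mu + n * t_bar) / (rho + n)  # expression (10)
    post_var = 1.0 / (rho + n)                      # expression (11)
    return post_mean, post_var
```

For instance, with 100 observations averaging 1.0 and a prior centered at 0 with precision 100, the data and the prior carry equal weight, so the posterior mean falls halfway between them.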

3. The Standard Interval for $\theta$ and Its Probability Coverage

The standard 95% interval for $\theta$ under the normal specification (6) is

(12)  $I(\bar{T}_N, N) = \bar{T}_N \pm 2/\sqrt{N}$.

For a priori fixed $N$, i.e., for $C = \infty$ in (7), the interval $I(\bar{T}_N, N)$ is a
95% confidence interval, covering $\theta$ in slightly more than 95% of experiments. For
ignorable stopping rules, that is, for $N$ that depends only on the observed data,
$T^*_N$, as in this example, the standard interval is the limit of a sequence of 95%
highest posterior density intervals as the prior variance $1/\rho$ tends to $\infty$, with
any fixed prior mean, $\mu$. One justification that has been offered for this interval
based on a "flat prior" is that the interval is in some sense conservative, adding less
prior information to the inference than priors which are more peaked. Note that the
interval corresponds to the one obtained from Jeffreys's (1961) noninformative prior if
$N$ is fixed a priori, but not generally if other stopping rules are used.

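We now investigate the sensitivity of interval estimation to variations in prior
assumptions about $\theta$. In particular, suppose the prior distribution of $\theta$ is
given by (9). Then the posterior distribution of $\theta$ is specified by (10) and (11).

Hence, the posterior coverage of the standard interval, $I(\bar{T}_N, N)$, is

(13)  $P(N, \bar{T}_N, \mu, \rho) = \mathrm{pr}(\theta \in [\bar{T}_N \pm 2/\sqrt{N}] \mid N, T^*_N, D^*_N, \mu, \rho)$
        $= \Phi\Bigl\{\sqrt{N + \rho}\,\Bigl(\bar{T}_N \pm \frac{2}{\sqrt{N}} - \frac{N\bar{T}_N + \rho\mu}{N + \rho}\Bigr)\Bigr\}$
        $= \Phi\Bigl\{\frac{\rho\tau}{\sqrt{N + \rho}} \pm 2\sqrt{1 + \rho/N}\Bigr\}$,

where $\Phi\{\,\}$ is the standard normal probability measure, so that $\Phi\{a \pm b\}$ is
the probability it assigns to the interval $[a - b, a + b]$, and
$\tau = \bar{T}_N - \mu$. In words, $P(N, \bar{T}_N, \mu, \rho)$ is the posterior
probability, given $\mu, \rho$, that $\theta$ falls in the standard interval,
$I(\bar{T}_N, N)$.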
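
The step from (10) and (11) to expression (13) is a standardization of the interval endpoints. As a sketch, write $m$ and $v$ for the posterior mean and variance (labels introduced here only for the derivation), with $\mu$ and $1/\rho$ the prior mean and variance of (9) and $\tau = \bar{T}_N - \mu$:

```latex
\begin{align*}
m &= \frac{\rho\mu + N\bar{T}_N}{\rho + N}, \qquad v = \frac{1}{\rho + N}, \\
\bar{T}_N - m &= \frac{\rho(\bar{T}_N - \mu)}{\rho + N} = \frac{\rho\tau}{\rho + N}, \\
\frac{\bar{T}_N \pm 2/\sqrt{N} - m}{\sqrt{v}}
  &= \frac{\rho\tau}{\sqrt{\rho + N}} \pm 2\sqrt{1 + \rho/N}.
\end{align*}
```

The posterior coverage is therefore the standard normal probability of an interval with half-width $2\sqrt{1 + \rho/N}$ centered at $\rho\tau/\sqrt{\rho + N}$, which is zero only when the sample mean equals the prior mean.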

Inferences about $\theta$ are relatively insensitive to prior specifications if
$P(N, \bar{T}_N, \mu, \rho)$ is approximately .95 or higher for values of $(\mu, \rho)$
that are not contradicted by the data. In the next section, we examine the frequency
properties of $P(N, \bar{T}_N, \mu, \rho)$ for various stopping rules, and we find that
fixed sample sizes yield less sensitivity than some stochastic stopping rules.


Some preliminary observations are possible based on inspection of expression (13).

1. For every fixed $\tau = \bar{T}_N - \mu$ and $N$,

   $P(N, \bar{T}_N, \mu, \rho) \to \Phi\{\pm 2\} = .95$ as $\rho \to 0$;

i.e., informally, if the prior (9) is actually diffuse, then the standard interval will be
approximately correct regardless of the stopping rule.

2. For every fixed $\rho$ and $N$,

   $P(N, \bar{T}_N, \mu, \rho) \to \Phi\{\pm 2\sqrt{1 + \rho/N}\} \ge .95$ as $\tau \to 0$;

i.e., informally, if the sample mean, $\bar{T}_N$, happens to be close to the prior mean,
$\mu$, then the standard interval will tend to be conservative regardless of the stopping
rule.

3. If we stop whenever $\tau/\sqrt{N}$ appears large, then $P(N, \bar{T}_N, \mu, \rho)$
will be more likely to fall below .95, since the interval will tend to be centered away
from zero. This fact is the basis for increased sensitivity under certain data-dependent
stopping rules.

4. If $N$ is fixed a priori, so that the stopping rule is not data dependent, then
$\tau/\sqrt{N} \mid \mu, \rho, N \sim N\bigl(0, \tfrac{1}{N}(\tfrac{1}{N} + \tfrac{1}{\rho})\bigr)$.
Therefore, $\tau/\sqrt{N}$ may be expected to be nearly zero if the prior variance,
$1/\rho$, of $\theta$ is small and $N$ is not too small, and hence, under these
conditions, the coverage $P(N, \bar{T}_N, \mu, \rho)$ of the standard 95% interval will
often be close to or above 95%. Observations 1 and 4 together provide a basis for expecting
the standard 95% interval to have approximately 95% coverage with fixed, large sample
sizes.
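
The first three observations can be checked numerically. The sketch below computes the posterior coverage of the interval $\bar{T}_N \pm 2/\sqrt{N}$ directly from the normal posterior in (10) and (11); the function name `posterior_coverage` and the particular input values are assumptions chosen for illustration, not values taken from the paper.

```python
import math
from statistics import NormalDist


def posterior_coverage(t_bar, n, mu, rho):
    """Posterior probability that theta lies in t_bar +/- 2/sqrt(n).

    Uses the normal posterior with mean (10) and variance (11);
    mu and rho are the prior mean and prior precision.
    """
    post_mean = (rho * mu + n * t_bar) / (rho + n)
    post_sd = math.sqrt(1.0 / (rho + n))
    posterior = NormalDist(post_mean, post_sd)
    half_width = 2.0 / math.sqrt(n)
    return posterior.cdf(t_bar + half_width) - posterior.cdf(t_bar - half_width)


# Observation 1: a nearly flat prior gives roughly .95 whatever the data.
diffuse = posterior_coverage(t_bar=3.0, n=10, mu=0.0, rho=1e-9)
# Observation 2: sample mean equal to the prior mean gives at least .95.
centered = posterior_coverage(t_bar=0.0, n=100, mu=0.0, rho=100.0)
# Observation 3: an early stop with an elevated mean and a tight prior
# puts the interval away from the prior mean and lowers the coverage.
elevated = posterior_coverage(t_bar=0.5, n=16, mu=0.0, rho=100.0)
```

The third call uses a mean sitting on the boundary $\bar{T}_N = 2/\sqrt{N}$ at $N = 16$, the kind of outcome a rule such as (7) with $C = 2$ selects for.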


This section describes a simulation which shows that the sensitivity of Bayesian
inference depends in part on the stopping rule used. In particular, we find that the
marginal distribution given $\mu, \rho$ of the posterior coverage probability
$P(N, \bar{T}_N, \mu, \rho)$ of the usual 95% interval may be less tightly concentrated
around .95 if certain data-dependent stopping rules are used.

Seven stopping rules are compared, all using the same distribution of $\theta$. First,
$\theta$ is sampled from $N(0, 1/\rho)$. Then $T_1, T_2, \ldots$ are independently sampled
from $N(\theta, 1)$. Four data-dependent stopping rules Elev-$\infty$, Elev-2, Elev-1.5
and Elev-.5 involve stopping at the first $N \le 100$ at which $\bar{T}_N$ is elevated. In
particular, rule Elev-C is given by (7). Of course, Elev-$\infty$ fixes the sample size at
$N = 100$.




To investigate the effect of data-dependent stopping rules, it is not sufficient to
compare Elev-$\infty$ with Elev-C, for C = 2, 1.5, .5, since the marginal distribution of
sample sizes is not the same for the four stopping rules. Therefore, we also sampled using
three other stopping rules, namely Rand-C for C = 2, 1.5, .5, where Rand-C and Elev-C
produce the same marginal distribution of sample sizes $N$, but under Rand-C the sample
size $N$ is conditionally independent of $\bar{T}_N/\sqrt{N}$ given $\theta$. The Rand-C
rule can be implemented as follows: If one observation from rule Elev-C yielded a sample
size of $N$, then Rand-C also stopped at sample size $N$, but $\bar{T}_N$ was calculated
from $N$ new independent observations from $N(\theta, 1)$.
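
The pairing of Elev-C with Rand-C described above can be sketched in Python as follows. The function names `elev_c` and `rand_c` and the use of a seeded `random.Random` generator are assumptions made for this example; the original simulation code is not available.

```python
import math
import random


def elev_c(theta, c, rng, max_n=100):
    """Draw T_i ~ N(theta, 1) until rule (7) stops; return (N, sample mean)."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(theta, 1.0)
        # Stop at the cap, or at the first elevated running mean.
        if n == max_n or total / n > c / math.sqrt(n):
            return n, total / n


def rand_c(theta, c, rng, max_n=100):
    """Take N from an Elev-C run, then compute the mean from N new draws."""
    n, _ = elev_c(theta, c, rng, max_n)                 # sample size only
    fresh = [rng.gauss(theta, 1.0) for _ in range(n)]   # N new observations
    return n, sum(fresh) / n
```

Because `rand_c` discards the observations that triggered the stop, its sample sizes have the Elev-C distribution while its sample mean is no longer selected for being elevated.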

Table 1 displays estimates of the 10% point of the sampling distribution of Bayesian
coverage probabilities $P(N, \bar{T}_N, 0, \rho)$ of the standard 95% interval,
$\bar{T}_N \pm 2/\sqrt{N}$. Examination of this table leads to the following observations.
First, as one would expect, if the prior distribution of $\theta$ is diffuse
($\rho = .01$), the coverage of the standard interval is nearly 95% for all seven stopping
rules. Second, for both fixed sample sizes (Elev-$\infty$) and purely random sample sizes
(Rand-C), the estimated 10% point of the distribution of coverage probabilities is at
least .93 for all prior precisions in the table. However, this estimated 10% point falls
as low as .54 for Elev-2. In other words, it appears that the posterior coverage of the
usual interval, $\bar{T}_N \pm 2/\sqrt{N}$, may be lower than 55% in 10% of experiments if
the Elev-2 stopping rule is used. Inferences based on the standard
$\bar{T}_N \pm 2/\sqrt{N}$ interval are less sensitive to prior/model specifications if
data-dependent stopping rules such as Elev-2 are avoided.
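
A rough version of this kind of simulation can be sketched as below. It is not the authors' program and is not expected to reproduce Table 1 digit for digit: the function names, the seed, the 200 replications, and the crude order-statistic estimate of the 10% point are all assumptions made for illustration. The coverage is computed from the closed form implied by (10) and (11) with prior mean 0.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def coverage(t_bar, n, rho, mu=0.0):
    """Posterior coverage of t_bar +/- 2/sqrt(n) under prior N(mu, 1/rho)."""
    center = rho * (t_bar - mu) / math.sqrt(rho + n)
    half = 2.0 * math.sqrt(1.0 + rho / n)
    return STD_NORMAL.cdf(center + half) - STD_NORMAL.cdf(center - half)


def one_replication(c, rho, rng, max_n=100):
    """Sample theta from the prior, run Elev-C, return the posterior coverage."""
    theta = rng.gauss(0.0, 1.0 / math.sqrt(rho))
    total, n = 0.0, 0
    while n < max_n:
        n += 1
        total += rng.gauss(theta, 1.0)
        if n < max_n and total / n > c / math.sqrt(n):
            break  # running mean is elevated: stop early
    return coverage(total / n, n, rho)


def tenth_percentile(c, rho, reps=200, seed=0):
    """Crude empirical 10% point: the order statistic at index 0.1 * reps."""
    rng = random.Random(seed)
    values = sorted(one_replication(c, rho, rng) for _ in range(reps))
    return values[int(0.10 * reps)]


elev_2 = tenth_percentile(2.0, 100.0)       # data-dependent rule, tight prior
fixed = tenth_percentile(math.inf, 100.0)   # fixed sample size of 100
flat = tenth_percentile(2.0, 0.01)          # data-dependent rule, diffuse prior
```

Under these assumptions, `elev_2` should come out well below `fixed`, while `flat` should stay close to .95, which is the qualitative pattern the text describes.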

The comparison of the Elev-C and Rand-C rules in Table 1 has shown that it is not the
size of the sample, but rather the reason for stopping that causes the increased
sensitivity. Nonetheless, given that an Elev-C rule was used, it is natural to ask: Can we
identify, using the terminal sample size $N$, the intervals that have poor coverage? Figure
1 addresses this question. Twenty-five independent samples were drawn using Elev-2 and
$\rho = 100$, and the coverage probabilities were plotted against the sample sizes.
Fifteen of the twenty-five samples stopped at sample size 100; of these, 14 yielded
coverage probabilities of about .99. All ten samples that stopped with fewer than 100
observations


TABLE 1. Estimated 10% Points of the Distribution of Coverage Probabilities
$P(N, \bar{T}_N, 0, \rho)$ for the Stopping Rules and Various Values of $\rho$.

                    Prior precision $\rho$
                    100       10        1         .01

                    Prior standard deviation ($1/\sqrt{\rho}$)
Stopping rule       .1        .32       1         10

Elev-$\infty$       .84       .93       .95       .95
Elev-2              .54*      .78       .91       .95
Rand-2              .87*      .92       .94       .95
Elev-1.5            .80**     .84       .91       .95
Rand-1.5            .86**     .91       .95       .95
Elev-.5             .93       .92       .89       .95
Rand-.5             .89       .87       .93       .95

*  Based on 400 pseudoreplications.
** Based on 800 pseudoreplications.
All other values are based on 200 pseudoreplications.


Figure  1 
had coverage of less than .9, with the lowest coverages for samples with fewer than ten
observations.
5. Interpretation of Results.

The marginal distribution of Bayesian coverage probabilities that was simulated in the
last section has two, not really contradictory, interpretations: a subjectivist
interpretation and a frequency interpretation.

In the subjectivist interpretation, the prior distribution on $\theta$ reflects personal
beliefs about $\theta$ before data are observed. The distribution of coverage probabilities
$P(N, \bar{T}_N, \mu, \rho)$ is relevant during the design of an experiment if the standard
95% interval (12) will be used in the analysis, since the distribution describes personal
predata beliefs about the final postdata coverage of that interval. If for practical
reasons the standard interval is to be used in the analysis, some subjectivists may want to
design the experiment so that this interval can be expected to be at worst conservative for
a range of reasonable prior distributions. The Elev-$\infty$ and Rand-C stopping rules do
this quite well, whereas Elev-2 runs a risk of substantial undercoverage.

In the frequency interpretation, the current experiment is viewed (perhaps accurately)
as one in a long series of experiments, and the prior distribution of $\theta$ is the
distribution of $\theta$ values arising in these experiments. In this context, there is a
correct Bayesian inference based on the true prior distribution, which is unknown to the
experimenter. The experimenter specifies some prior distribution(s), often flat prior
distributions, and hopes that inferences based on the specified prior approximate the
correct inferences based on the true but unknown prior. Our simulation illustrates that
this hope will be fulfilled with greater frequency if rules Elev-$\infty$ and Rand-C are
used in place of Elev-2.

We have examined the heightened sensitivity of Bayesian inferences to misspecification
of the prior/model when data-dependent stopping rules are used; however, our investigation
has been confined to a highly specialized and somewhat artificial case. Further
investigation is required to identify situations which produce more or less sensitivity